Abstract. We propose a method of detecting non-self-correcting information
cascades in experiments in which subjects choose an option sequentially
by observing the choices of previous subjects. The method
uses the correlation function C(t) between the first and the (t+1)-th subject’s
choices. C(t) measures the strength of the domino effect, and the
limit value c = lim_{t→∞} C(t) determines whether the domino effect lasts
forever (c > 0) or not (c = 0). The condition c > 0 is a sufficient
condition for a non-self-correcting system, and the probability that the
majority’s choice remains wrong in the limit t → ∞ is then positive. We apply
the method to data from two experiments in which T subjects answered
two-choice questions: (i) general knowledge questions (T_avg = 60) and
(ii) urn-choice questions (T = 63). We find c > 0 for difficult questions
in (i) and for all cases in (ii); in these cases the systems are not self-correcting.
1 Introduction

Herding phenomena are ubiquitous in human and animal behavior [1,2]. An
example is an information cascade, in which a person observes others’ choices and 
chooses the majority’s choice even though the person’s private signal contradicts 
it [?]. Following the majority is rational for people who are uncertain of their choice.
If an information cascade occurs, the same mechanism applies to later decision-makers,
and the majority’s choice tends to prevail. In some cases, the successive
choices are wrong, and the cascade leads to irrational herding behavior [5]. 

An experimental setup demonstrates a situation in which an information
cascade occurs [6]. There are two urns, A and B; urn A (B) contains two a (b)
balls and one b (a) ball. In each run of the experiment, an urn is randomly
chosen initially and called X. Then, the subjects guess whether urn X is A or B
and choose sequentially. They get a reward for the correct choice. In the course 
of the experiment, each subject draws a ball from X, which is his private signal. 
If the ball is a (b), urn X is more likely to be A (B). He also observes the choices 
of the previous subjects. If the difference between the numbers of subjects who 
choose each urn exceeds two, the private signal cannot overcome the majority’s 
choice. An information cascade starts if someone chooses the majority’s choice 
although his private signal suggests the minority’s one. As the probability that 
the first two persons both choose the wrong option is non-zero, the probability 
for the onset of a cascade where the majority’s choice is wrong is positive. 
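
For the 2:1 urns described above, the arithmetic behind this argument can be checked directly. The sketch below is only an illustration and rests on two assumptions that the text does not spell out: the prior over the two urns is uniform, and the first two subjects follow their own signals, so that their choices reveal those signals. The function name `posterior_urn_a` is ours.

```python
from fractions import Fraction

def posterior_urn_a(n_a, n_b, q=Fraction(2, 3)):
    """Posterior Pr(X = A) after n_a a-signals and n_b b-signals.

    Assumes a uniform prior over the urns and signal accuracy q
    (q = 2/3 for the urns with two balls of one type and one of the other).
    """
    like_a = q ** n_a * (1 - q) ** n_b      # likelihood of the signals if X = A
    like_b = (1 - q) ** n_a * q ** n_b      # likelihood of the signals if X = B
    return like_a / (like_a + like_b)

q = Fraction(2, 3)

# Probability that the first two private signals both point to the wrong urn.
p_two_wrong = (1 - q) ** 2

# Third subject: infers two a-signals from the first two choices, draws a b ball.
post_third = posterior_urn_a(2, 1)

print(p_two_wrong, post_third)
```

Under these assumptions the first two signals are both wrong with probability 1/9, and a third subject who infers two a signals and draws a b ball still assigns posterior probability 2/3 to urn A, so following the majority is the better guess.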

We now consider whether the wrong cascade continues [5]. If it continues 
forever, the majority’s choice converges to the wrong option. Information cascades
were initially considered to be fragile phenomena. As the trigger of the
cascade is a small imbalance, people can be dissuaded from following the majority’s
choice [3]. In addition, an agent model with a Bayesian update of the
private belief showed that the information cascade is self-correcting [5]. As the 
number of agents tends toward infinity, the wrong cascade disappears, and the 
majority’s choice converges to the optimal option. 

Using an information cascade experiment with a general knowledge two-choice
quiz, we have shown that a phase transition occurs between a one-peak
phase and a two-peak phase [5]. If the questions are easy, the ratio z(t) of
correct choices among the first t subjects converges to a value z_+ > 1/2 in the
limit t → ∞. As there is only one peak in the probability distribution function
of z(t), we call the corresponding phase the one-peak phase [10,11]. If the
questions are difficult and most people do not know the answers, z(t) converges
to z_+ > 1/2 or z_- < 1/2. One cannot predict the value in {z_+, z_-} to which
z(t) converges. We call the corresponding phase the two-peak phase. In the
two-peak phase, the wrong cascade does not necessarily disappear, and the
system is not self-correcting.

It was recently shown that the limit value of the normalized correlation function
is the order parameter of the phase transition [?]. The normalized correlation
function shows how the first subject’s choice propagates to later subjects. It
provides a measure of the domino effect. In addition, the positiveness of the limit 
value is a sufficient condition for a non-self-correcting system. By extrapolating 
the results for a finite system to infinity, we can determine whether the system 
is self-correcting. We report on the application of the method to data from two 
types of information cascade experiments. In section 2, we define the normalized 
correlation function. We also explain the behavior of the function in each phase 
and the extrapolation method used to estimate its limit. We present the results 
of the data analysis in section 3. Section 4 summarizes the results. 


2 Correlation function and asymptotic behaviors 

We consider a typical information cascade experiment. T subjects answer a
two-choice question sequentially in each run. We denote the order of the
subjects as t, where t = 1, 2, ..., T, and the choice of subject t by
X(t) ∈ {0, 1}. If the choice is correct (incorrect), X(t) takes the value 1 (0).

Detection of non-self-correcting nature of information cascade




Fig. 1. Response function q(z) vs. z. Left panel shows the one-peak phase, in which
there is one solution, z_+, for z = q(z). Right panel shows the two-peak phase, in which
there are three solutions, z_- < z_u < z_+, for z = q(z).


The correlation function C(t) is defined as the covariance between X(1) and
X(t+1) divided by the variance of X(1):

$$C(t) = \mathrm{Cov}(X(1), X(t+1)) / \mathrm{Var}(X(1)).$$

C(t) can be expressed as the difference of two conditional probabilities:

$$C(t) = \Pr(X(t+1) = 1 \mid X(1) = 1) - \Pr(X(t+1) = 1 \mid X(1) = 0). \tag{1}$$

C(t) shows the degree to which the first subject’s choice is transmitted to later 
subjects. It is a measure of the domino effect in an information cascade. 
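
The equivalence between the covariance definition and the conditional-probability form in eq. (1) can be checked numerically. The following is a minimal sketch, assuming the data are stored as equal-length lists of 0/1 choices (one list per run); the function names and the toy data are ours, not part of the experiments.

```python
def correlation_function(sequences):
    """Estimate C(t), t = 0..T-1, as a difference of conditional frequencies.

    sequences[i][k] is the choice X(k+1) of subject k+1 in run i,
    so the returned list entry k is C(k).
    """
    started_1 = [s for s in sequences if s[0] == 1]
    started_0 = [s for s in sequences if s[0] == 0]
    if not started_1 or not started_0:
        raise ValueError("need runs with X(1) = 1 and runs with X(1) = 0")
    T = len(sequences[0])
    c = []
    for k in range(T):
        p1 = sum(s[k] for s in started_1) / len(started_1)  # Pr(X(k+1)=1 | X(1)=1)
        p0 = sum(s[k] for s in started_0) / len(started_0)  # Pr(X(k+1)=1 | X(1)=0)
        c.append(p1 - p0)
    return c

def correlation_function_cov(sequences):
    """The same quantity computed as Cov(X(1), X(t+1)) / Var(X(1))."""
    n = len(sequences)
    T = len(sequences[0])
    mean_first = sum(s[0] for s in sequences) / n
    var_first = mean_first * (1 - mean_first)   # variance of a 0/1 variable
    c = []
    for k in range(T):
        mean_k = sum(s[k] for s in sequences) / n
        mean_prod = sum(s[0] * s[k] for s in sequences) / n
        c.append((mean_prod - mean_first * mean_k) / var_first)
    return c

# Toy data: five runs of four subjects each (made up for illustration).
runs = [[1, 1, 1, 0], [1, 1, 0, 1], [0, 0, 1, 0], [0, 1, 0, 0], [1, 0, 1, 1]]
c_diff = correlation_function(runs)
c_cov = correlation_function_cov(runs)
```

Both functions return the same values up to rounding, and C(0) = 1 by construction, since X(1) is perfectly correlated with itself.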

C(t) is generally positive, and its asymptotic behavior depends on the phase 
of the system and the shape of the response function q(z). Here q(z) represents 
the dependence of the probability of the correct choice by subject t + 1 on the 
ratio z(t) of the correct choices of the previous t subjects. 

$$q(z) = \Pr(X(t+1) = 1 \mid z(t) = z), \qquad z(t) = \frac{1}{t} \sum_{s=1}^{t} X(s).$$

With this definition of q(z), the stochastic process {X(t)}, t = 1, 2, ..., becomes
a generalized Polya urn process [?]. If there is one solution for z = q(z) at z_+
(left panel in Fig. 1), z(t) converges to z_+. C(t) shows power-law decay for large
t with two constants, c' and l, as

$$C(t) \sim c' \cdot t^{l-1}, \qquad l < 1.$$

Here, l determines the exponent l - 1 of the power-law decay and is less than 1.
The value of l is given by q'(z_+) [?]. If there are three solutions for z = q(z) at
z_- < z_u < z_+ (right panel in Fig. 1), the system is in the two-peak phase;
lim_{t→∞} z(t) = z_+ or z_-. The limit value c = lim_{t→∞} C(t) is positive, and
the first subject’s choice propagates to an infinite number of later subjects [14].
C(t) behaves asymptotically as

$$C(t) \sim c + c' \cdot t^{l-1}. \tag{2}$$

Here c' · t^{l-1} is the subleading term of C(t), and l is given by the larger value
among {q'(z_+), q'(z_-)}. Further, c acts as an order parameter of the phase
transition, and eq. (2) is the general asymptotic behavior of C(t) [15].
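
The process defined by a response function q(z) is easy to simulate. The sketch below is an illustration only: the two response functions are hypothetical examples chosen to have one and three solutions of z = q(z), and the convention z = 1/2 for the first subject (who has no predecessors) is our assumption, not something stated in the text.

```python
import math
import random

def simulate_run(q, T, rng):
    """Simulate one run: subject t+1 is correct with probability q(z(t)).

    z(t) is the fraction of correct choices among the first t subjects.
    Assumption: z = 1/2 is used for the first subject.
    """
    choices, correct = [], 0
    for t in range(T):
        z = correct / t if t > 0 else 0.5
        x = 1 if rng.random() < q(z) else 0
        choices.append(x)
        correct += x
    return choices

# Hypothetical response functions, for illustration only.
def q_one_peak(z):
    # Linear: a single solution of z = q(z) at z_+ = 0.8, with slope 0.5 there.
    return 0.4 + 0.5 * z

def q_two_peak(z):
    # S-shaped: three solutions of z = q(z); the middle one (z = 0.5) is unstable.
    return 0.5 + 0.4 * math.tanh(4.0 * (z - 0.5))

def wrong_majority_fraction(q, T=500, runs=200, seed=0):
    """Fraction of simulated runs that end with z(T) < 1/2."""
    rng = random.Random(seed)
    wrong = 0
    for _ in range(runs):
        if sum(simulate_run(q, T, rng)) / T < 0.5:
            wrong += 1
    return wrong / runs

frac_one = wrong_majority_fraction(q_one_peak)
frac_two = wrong_majority_fraction(q_two_peak)
```

For the linear example one would expect almost no runs to end with a wrong majority, whereas for the symmetric S-shaped example roughly half of the runs should end near the lower solution. These are properties of the toy functions, not of the experimental data.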

As it is difficult to estimate c using c = lim_{t→∞} C(t) with empirical data,
where the system size and number of samples are strictly limited, we introduce
two quantities for the estimation. First, we define the n-th moment m_n(t) of
C(t) as

$$m_n(t) = \sum_{s=0}^{t-1} C(s) (s/t)^n.$$

We define the integrated correlation time τ(t) as τ(t) = m_0(t). We also define
the second moment correlation time ξ(t) as

$$\xi(t) = t \sqrt{m_2(t) / m_0(t)}.$$

Using the asymptotic behavior of C(t), we estimate the asymptotic behavior of
τ(t)/t and ξ(t)/t:

$$\tau(t)/t \sim c + \frac{c'}{l} \cdot t^{l-1}, \tag{3}$$

$$\xi(t)/t \sim \sqrt{\frac{l}{l+2}}. \tag{4}$$

As τ(t)/t is defined as the summation of C(s) over 0 ≤ s < t divided by t, its
standard error becomes smaller than that of C(t). The asymptotic behavior of
τ(t)/t in eq. (3) provides a more reliable estimate of c and l than the fitting of
C(t) to eq. (2). ξ(t)/t in eq. (4) also provides a reliable estimate for l [15].
If c > 0, the leading term of C(t) is the constant c, and l should be interpreted
as l = 1.
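
The moment-based quantities can be computed directly from a list of C(s) values. This is a minimal sketch, assuming m_n(t) = Σ_{s=0}^{t-1} C(s)(s/t)^n, τ(t) = m_0(t), and ξ(t) = t·sqrt(m_2(t)/m_0(t)); the function names and the synthetic C(s) sequences are ours.

```python
import math

def moment(c, t, n):
    """m_n(t) = sum_{s=0}^{t-1} C(s) * (s/t)**n, where c[s] = C(s)."""
    return sum(c[s] * (s / t) ** n for s in range(t))

def tau_over_t(c, t):
    """Integrated correlation time divided by t."""
    return moment(c, t, 0) / t

def xi_over_t(c, t):
    """Second moment correlation time divided by t."""
    return math.sqrt(moment(c, t, 2) / moment(c, t, 0))

t = 1000

# Synthetic two-peak-like case: C(s) is a positive constant.
c_const = [0.3] * t
tau_const = tau_over_t(c_const, t)   # equals the constant
xi_const = xi_over_t(c_const, t)     # close to sqrt(1/3)

# Synthetic one-peak-like case: pure power law C(s) = (s + 1)**(l - 1), l = 0.5.
l = 0.5
c_power = [(s + 1) ** (l - 1.0) for s in range(t)]
xi_power = xi_over_t(c_power, t)     # close to sqrt(l / (l + 2))
```

For a constant C(s) the ratio ξ(t)/t approaches sqrt(1/3), which corresponds to l = 1, while for a pure power law it approaches sqrt(l/(l+2)); both limits are reached only up to finite-size corrections.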

We define whether the system is self-correcting according to whether z(t)
always converges to z_+. In the one-peak (two-peak) phase, the system is
(non-)self-correcting. If c > 0, the system is in the two-peak phase and is
non-self-correcting. However, c = 0 does not necessarily mean that the system is
self-correcting. For the system to be self-correcting, q(z) = z has to have only one
solution, z_+.
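
Counting the solutions of q(z) = z for a given response function can be done numerically. The sketch below is one simple way to do it, by scanning for sign changes of q(z) - z on a grid and refining with bisection; it is not the procedure used in the paper, and the two response functions are hypothetical examples.

```python
import math

def fixed_points(q, grid=1000, tol=1e-12):
    """Locate solutions of z = q(z) on [0, 1].

    Scans a uniform grid for sign changes of f(z) = q(z) - z and refines
    each bracket by bisection. Solutions where f touches zero without
    changing sign between grid nodes are not detected.
    """
    def f(z):
        return q(z) - z

    roots = []
    for i in range(grid):
        a, b = i / grid, (i + 1) / grid
        fa, fb = f(a), f(b)
        if fa == 0.0:
            roots.append(a)              # exact solution on a grid node
        elif fa * fb < 0.0:
            while b - a > tol:           # bisection on the bracket [a, b]
                m = 0.5 * (a + b)
                if fa * f(m) <= 0.0:
                    b = m
                else:
                    a, fa = m, f(m)
            roots.append(0.5 * (a + b))
    if f(1.0) == 0.0:
        roots.append(1.0)
    return roots

# Hypothetical response functions, for illustration only.
def q_linear(z):
    return 0.4 + 0.5 * z

def q_s_shaped(z):
    return 0.5 + 0.4 * math.tanh(4.0 * (z - 0.5))

roots_linear = fixed_points(q_linear)      # one solution: one-peak phase
roots_s_shaped = fixed_points(q_s_shaped)  # three solutions: two-peak phase
```

With an empirically estimated q(z) the same count is only as reliable as the estimate itself, so the number of solutions should be read together with the error bars of q(z).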


3 Domino effect and detection of non-self-correcting nature

We study the domino effect and non-self-correction in information cascades. We 
discuss two types of information cascade experiments. 

In experiment 1 (EXP-I), subjects answered a general knowledge two-choice 
quiz. First, the subjects answered using only their own knowledge. Then, they 
observed the choices of previous subjects and answered the question again. The 
average length of the sequence of subjects is T = 60, and the number of choice
sequences is 240. The choice sequences are classified into four bins according
to the ratio of correct choices z_0(T) of the first answers without observation as
z_0(T) = 50% ± 5%, 60% ± 5%, 70% ± 5%, and 80% ± 5%, and the number of samples
in each bin is 38 (50% ± 5%), 52 (60% ± 5%), 38 (70% ± 5%), and 38 (80% ± 5%),
respectively [16].

Experiment 2 (EXP-II) is similar to the situation explained in the Introduction.
There are two urns, A and B, which contain a and b balls in different
configurations. We use two configuration patterns: (i) two a balls and one b ball
in urn A vs. one a ball and two b balls in urn B and (ii) five a balls and four b balls
in urn A vs. four a balls and five b balls in urn B. Urn X ∈ {A, B} is chosen at
random at the beginning of each run, and subjects are asked to choose between
A and B. Each subject draws one ball from X and checks whether it is a or b.
The ball corresponds to the type of urn X with probability q = 2/3 (5/9) for (i)
[(ii)]. In addition, the subject observes the choices of previous subjects. Our
results, unlike those of previous experiments [?], show the summary statistics
of the number of subjects who have chosen each urn. The length T and number
of questions I are 63 and 200, respectively, for q ∈ {2/3, 5/9} [17].

We denote the choice sequences in each bin as {X(i,t)}, i = 1, ..., I, t =
1, ..., T. Here, the length of the sequence depends on question i in EXP-I; we
denote it as T(i). The number of samples I also depends on the bins. In EXP-II,
T(i) = 63, and I = 200. First, we estimate C(t) and its standard error ΔC(t)
using eq. (1). We denote the estimate and standard error of the probabilities
Pr(X(t+1) = 1 | X(1) = x) as q_x(t+1) and Δq_x(t+1), respectively. They are
estimated from experimental data {X(i,t)} as

$$q_x(t+1) = \frac{1 + \sum_{i=1}^{I} X(i,t+1)\,\delta_{X(i,1),x}}{N_x + 2}, \qquad \Delta q_x(t+1) = \sqrt{\frac{q_x(t+1)\,(1 - q_x(t+1))}{N_x + 3}}, \qquad N_x = \sum_{i=1}^{I} \delta_{X(i,1),x}.$$

Here, we use the expectation value and standard deviation obtained from the 
posterior probability distribution for the probabilities. C(t) is then estimated as 


$$C(t) = q_1(t+1) - q_0(t+1).$$

The error bars of C(t) are given as

$$\Delta C(t) = \sqrt{\Delta q_1(t+1)^2 + \Delta q_0(t+1)^2}.$$
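
The estimation step can be sketched in a few lines. The code below assumes a uniform (Beta(1,1)) prior for each conditional probability, which gives a posterior mean of (1 + n_1)/(N_x + 2) and a posterior variance of mean·(1 - mean)/(N_x + 3); the uniform prior is our reading of the "+1" and "+2" in the estimator rather than something the text states explicitly, and the function names and toy data are ours.

```python
import math

def conditional_estimate(sequences, k, x):
    """Posterior mean and standard deviation of Pr(X(k+1) = 1 | X(1) = x).

    Assumes a uniform prior, so the posterior is Beta(1 + n_1, 1 + N_x - n_1),
    where N_x counts runs with X(1) = x and n_1 counts those with X(k+1) = 1.
    """
    subset = [s for s in sequences if s[0] == x]
    n_x = len(subset)
    n_1 = sum(s[k] for s in subset)
    mean = (1 + n_1) / (n_x + 2)
    sd = math.sqrt(mean * (1 - mean) / (n_x + 3))
    return mean, sd

def correlation_with_error(sequences, t):
    """Estimate C(t) and its error bar from 0/1 choice sequences."""
    q1, dq1 = conditional_estimate(sequences, t, 1)
    q0, dq0 = conditional_estimate(sequences, t, 0)
    return q1 - q0, math.sqrt(dq1 ** 2 + dq0 ** 2)

# Toy data: five runs of four subjects each (made up for illustration).
runs = [[1, 1, 1, 0], [1, 1, 0, 1], [0, 0, 1, 0], [0, 1, 0, 0], [1, 0, 1, 1]]
c1, dc1 = correlation_with_error(runs, 1)
```

Because of the prior, the estimate of C(0) obtained this way is slightly below 1 for a finite number of runs; the difference shrinks as N_x grows.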

Fig. 2. C(t) vs. t for EXP-I. The sample choice sequences are classified according to the
value of z_0(T) as z_0(T) = 50% ± 5% (□), 60% ± 5% (○), 70% ± 5% (△), and 80% ± 5% (▽).
We plot only data with the interval Δt = 5. To see the behavior clearly, we slightly
shift the data horizontally.

Using C(t) and ΔC(t), we estimate the error bars of m_n(t) as

$$\Delta m_n(t) = \sqrt{\sum_{s=0}^{t-1} \Delta C(s)^2 \, (s/t)^{2n}}. \tag{5}$$

Here we assume that ΔC(s) and ΔC(s') are independent of each other if s ≠ s'.
We estimate the error bars of τ(t) and ξ(t) as

$$\Delta\tau(t) = \Delta m_0(t), \qquad \Delta\xi(t) = \xi(t) \left( \frac{\Delta m_2(t)}{2 m_2(t)} + \frac{\Delta m_0(t)}{2 m_0(t)} \right). \tag{6}$$

In the estimation of Δξ(t), we assume that Δm_2(t) and Δm_0(t) are completely
correlated.
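
The error propagation for the moments can be written out as follows. This is a sketch under the assumptions stated in the text (independent errors ΔC(s) for different s, and fully correlated errors of m_2 and m_0 when forming Δξ), together with our reading that independent errors are added in quadrature; the function names and the numbers in the example are ours.

```python
import math

def moment_with_error(c, dc, t, n):
    """m_n(t) and its error bar, with independent errors added in quadrature.

    c[s] = C(s) and dc[s] = Delta C(s) for s = 0..t-1.
    """
    m = sum(c[s] * (s / t) ** n for s in range(t))
    dm = math.sqrt(sum((dc[s] * (s / t) ** n) ** 2 for s in range(t)))
    return m, dm

def tau_xi_with_errors(c, dc, t):
    """Return (tau, d_tau) and (xi, d_xi) at time t."""
    m0, dm0 = moment_with_error(c, dc, t, 0)
    m2, dm2 = moment_with_error(c, dc, t, 2)
    tau, d_tau = m0, dm0
    xi = t * math.sqrt(m2 / m0)
    # Relative errors of m_2 and m_0 are summed (treated as fully correlated).
    d_xi = xi * (dm2 / (2.0 * m2) + dm0 / (2.0 * m0))
    return (tau, d_tau), (xi, d_xi)

# Made-up values of C(s) and Delta C(s) for s = 0, 1, 2, 3.
c = [1.0, 0.5, 0.5, 0.5]
dc = [0.0, 0.1, 0.1, 0.1]
(tau, d_tau), (xi, d_xi) = tau_xi_with_errors(c, dc, 4)
```

Summing the two relative errors gives a conservative error bar for ξ(t); adding them in quadrature would give a smaller one.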

3.1 EXP-I: General knowledge quiz case 

Figure 2 plots C(t) vs. t. The value of C(t) generally decreases from its initial
value of 1 with increasing t. Because the number of samples is limited, ΔC(t) is
large. We see that for difficult questions with z_0(T) = 50% ± 5% and 60% ± 5%,
C(t) is positive for large values of t. On the other hand, for easy questions with
z_0(T) = 70% ± 5% and 80% ± 5%, C(t) decreases to zero with increasing t. These
results suggest that the system is in the two-peak phase for difficult questions.
For z_0(T) = 70% ± 5% and 80% ± 5%, an analysis of q(z) showed that the system
was in the one-peak phase [?].

Fig. 3. ξ(t)/t and τ(t)/t vs. t for EXP-I with the interval Δt = 5. We also plot the
fitted results.

Figure 3 shows plots of ξ(t)/t and τ(t)/t vs. t. The standard errors for ξ(t)/t
are larger than those for τ(t)/t because ξ(t) is calculated with the second moment
m_2(t). For large values of t, ξ(t)/t ≈ √(1/3) for difficult questions with z_0(T) =
50% ± 5% and 60% ± 5%. The results suggest that the system is in the two-peak
phase. For easy questions with z_0(T) = 70% ± 5% and 80% ± 5%, ξ(t)/t ≈ 0.5
for large values of t. As ξ(t)/t ≈ √(l/(l + 2)), l ≈ 0.7 for easy questions. As l is
smaller than 1, the system is in the one-peak phase.
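
The step from the plateau value of ξ(t)/t to the exponent l is a one-line inversion of ξ(t)/t ≈ sqrt(l/(l+2)). A small sketch of that arithmetic (the function name is ours):

```python
import math

def exponent_from_xi_ratio(r):
    """Solve r = sqrt(l / (l + 2)) for l, given the plateau value r of xi(t)/t."""
    if not 0.0 <= r < 1.0:
        raise ValueError("r must lie in [0, 1)")
    return 2.0 * r * r / (1.0 - r * r)

l_easy = exponent_from_xi_ratio(0.5)                   # plateau read off for easy questions
l_hard = exponent_from_xi_ratio(math.sqrt(1.0 / 3.0))  # plateau for difficult questions
```

A plateau of 0.5 gives l = 2/3, which is the value quoted as l ≈ 0.7, and a plateau of sqrt(1/3) gives l = 1, the value assigned to the two-peak phase.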

As the system is considered to be in the two-peak phase for z_0(T) = 50% ± 5%
and 60% ± 5%, we assume τ(t)/t = c + d · t^{l-1} and estimate c, l, d using a least
squares fit. We find that c = 0.297(2) for z_0(T) = 50% ± 5% and c = 0.26(1) for
z_0(T) = 60% ± 5%. For z_0(T) = 70% ± 5% and 80% ± 5%, we assume τ(t)/t =
d · t^{l-1} and estimate l and d. We find that l = 0.43(1) for z_0(T) = 70% ± 5% and
l = 0.35(1) for z_0(T) = 80% ± 5%, which differ slightly from the value of l ≈ 0.7
estimated from ξ(t)/t.
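
The text does not say how the least squares fit was carried out, so the following is only one possible way to fit τ(t)/t = c + d · t^{l-1}. It is an unweighted fit that exploits the fact that the model is linear in (c, d) once l is fixed: l is scanned on a grid, and c and d are solved from the normal equations for each trial value. The function name, the grid, and the synthetic data are ours.

```python
def fit_tau_over_t(ts, ys, l_grid=None):
    """Unweighted least-squares fit of y = c + d * t**(l - 1).

    For each trial exponent l the model is linear in (c, d), so the normal
    equations give c and d in closed form; l is chosen by grid search.
    Returns (c, d, l) for the smallest sum of squared residuals.
    """
    if l_grid is None:
        l_grid = [i / 100 for i in range(100)]   # l = 0.00, 0.01, ..., 0.99
    n = len(ts)
    best = None
    for l in l_grid:
        u = [t ** (l - 1.0) for t in ts]
        su, sy = sum(u), sum(ys)
        suu = sum(v * v for v in u)
        suy = sum(v * y for v, y in zip(u, ys))
        det = n * suu - su * su
        if abs(det) < 1e-15:                     # u is (nearly) constant: skip
            continue
        d = (n * suy - su * sy) / det
        c = (sy - d * su) / n
        sse = sum((y - c - d * v) ** 2 for v, y in zip(u, ys))
        if best is None or sse < best[0]:
            best = (sse, c, d, l)
    if best is None:
        raise ValueError("no usable exponent in l_grid")
    _, c, d, l = best
    return c, d, l

# Synthetic, noise-free data generated from c = 0.3, d = 0.5, l = 0.6.
ts = list(range(5, 61, 5))
ys = [0.3 + 0.5 * t ** (0.6 - 1.0) for t in ts]
c_hat, d_hat, l_hat = fit_tau_over_t(ts, ys)
```

With noisy data the parameters c and l are strongly correlated in this model, so error bars on c such as those quoted above would require a weighted fit using the error bars of τ(t)/t, which this sketch does not attempt.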

3.2 EXP-II: Urn choice case 

Figure 4 shows plots of C(t), ξ(t)/t, and τ(t)/t vs. t for q ∈ {2/3, 5/9}. As the
number of samples is larger than that in EXP-I, the standard errors are smaller
than the symbols’ size for τ(t)/t and large t. We see that C(t) is positive for
large values of t for both values of q. In addition, ξ(t)/t for
large values of t converges to √(1/3), and the exponent l for C(t) ∼ t^{l-1} is almost
one. These results suggest that the system is in the two-peak phase for both
values of q. We assume τ(t)/t = c + d · t^{l-1} and estimate c, l, d using a least
squares fit. We find that c = 0.261(1) for q = 2/3 and c = 0.207(1) for q = 5/9.

S. Mori et al.

Fig. 4. C(t), ξ(t)/t, and τ(t)/t vs. t for EXP-II. We use the symbol □ (○) for q =
2/3 (5/9). We plot only data with the interval Δt = 4. To see the behavior clearly, we
slightly shift the data horizontally.


4 Conclusion 

We studied the self-correcting nature of information cascades. We proposed the
use of the normalized correlation function C(t), which shows how the first subject’s
choice is propagated to later subjects and measures the strength of the
domino effect in information cascades. The condition c = lim_{t→∞} C(t) > 0 is a
sufficient condition for a non-self-correcting information cascade. In this case,
the domino effect continues infinitely. The system is in the two-peak phase, and
the probability that z(t) converges to z_- < 1/2 is positive. We used data from
two types of information cascade experiment: EXP-I, which used a general
knowledge quiz, and EXP-II, which used urns. The accuracy q of the private signal
is q ∈ {2/3, 5/9} in EXP-II. We estimated C(t) and its integrated quantities τ(t)
and ξ(t). In EXP-I, when the questions were difficult, c > 0. In EXP-II, c > 0 for
both values of q. In these cases, the system is non-self-correcting.

We focus on the study of the non-self-correcting nature of information cascades.
Although c > 0 is a sufficient condition for a non-self-correcting cascade,
c = 0 is not a sufficient condition for a self-correcting cascade. To verify that a
cascade is self-correcting, one should study the response function q(z) and count
the number of solutions for z = q(z). Alternatively, it is necessary to study the
limit value of the variance of z(t). If there is only one solution, z_+ > 1/2, or
the limit value is zero, the system is self-correcting. In EXP-I, we studied these
points and concluded that the system is self-correcting for z_0(T) = 70% ± 5% and
80% ± 5% [16]. Our experiment for EXP-II and its analysis are under way [?].